-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable support for free-threading #1295
base: main
Are you sure you want to change the base?
Conversation
@@ -629,7 +638,7 @@ def parallel_exec_transform_with_prettyprint( # noqa: C901 | |||
warnings: int = 0 | |||
skips: int = 0 | |||
|
|||
with pool_impl(processes=jobs) as p: # type: ignore | |||
with pool_impl() as executor: # type: ignore | |||
args = [ | |||
{ | |||
"transformer": transform, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't transform.context
get clobbered here:
Lines 264 to 271 in 985cec8
# Apart from metadata_manager, every field of context should be reset per file | |
transformer.context = CodemodContext( | |
scratch=deepcopy(scratch), | |
filename=filename, | |
full_module_name=mod_name, | |
full_package_name=pkg_name, | |
metadata_manager=transformer.context.metadata_manager, | |
) |
This is mostly safe with the GIL, so long as python doesn't switch to another thread that also owns a reference to transform
, but in free-threading with the sort of parallel thread pool setup you have here it's much more likely to happen.
In general - how do you want to go about adding multithreaded tests? It looks like there's a decent amount of stateful objects that get passed around and updated. How much should we worry about threads simultaneously updating stateful objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yeah absolutely - we'll probably have to change how state is passed around between threads. In fact, it might be better to just make a new transformer instance for each file and only pass immutable data structures between threads/processes.
In general - how do you want to go about adding multithreaded tests?
We could run the test suite with something like unittest-ft
, but the main thing I'd look for is the concurrent execution of the same set of visitors/transformers on multiple input files (basically what this PR sketches out), and then any bugs we find we can write targeted tests. How does that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does that sound?
Sounds great, that's more or less what I've been doing with pyca/cryptography
this week: pyca/cryptography#12555
Another thing I learned there is moving code form pure python to Rust can also be a way to avoid thread safety issues. If we move state from Python to Rust then we have much more control over concurrent multithreaded access.
I'll look at this next week - thanks for merging my other PR!
Instead of relying on `multiprocessing.Pool`, this PR replaces the implementation of `parallel_exec_transform_with_prettyprint` with `concurrent.futures.ProcessPoolExecutor`
Enable support for free-threading
This PR:
libcst.native
module as free-threading-compatibleThis depends on #1294 and #1289.
Stack created with Sapling. Best reviewed with ReviewStack.